Climbing has become extremely popular, and I would like to determine the optimal location for opening a retail store in Alberta. In this report, I look specifically at climbing areas in Alberta and cross-reference them against current outdoor retail stores and climbing gyms.
This report is targeted at individuals and/or corporations looking to open retail stores, offer guiding services, and/or sell climbing equipment.
Many popular, well-known climbing areas already have retail stores and gyms within their vicinity, so I aim to look at lesser-travelled areas in Alberta. The goal is to provide location recommendations to entrepreneurs.
I will be looking at the following factors:
sends is the number of times the climb or problem has been completed, as input by the climber
Foursquare also allows users to define venues as climbing gyms and rock climbing spots, using the category keys below. I have also gathered the keys for outdoor supply stores and sporting goods shops.
Climbing Gym: 503289d391d4c4b30a586d6a
Rock Climbing Spot: 50328a4b91d4c4b30a586d6b
Sporting Goods Shop: 4bf58dd8d48988d1f2941735
Using the climbing gym venue designation, I plan to obtain the locations of all rock climbing gyms in Alberta from Foursquare, along with any rock climbing spot location data. However, the majority of the rock climb location data has been obtained from other sources, as mentioned below. Since Foursquare is not widely used by climbers to add rock climbing locations, I plan to add this information to Foursquare's database using the POST method and the venue add endpoint.
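As a sketch of how such an addition could look, the helper below assembles the parameters for Foursquare's v2 `venues/add` endpoint. The venue name, coordinates, and credential values are illustrative placeholders, and the actual network call is left commented out:

```python
API_URL = "https://api.foursquare.com/v2/venues/add"

def build_add_venue_params(name, lat, lng, category_id,
                           oauth_token="YOUR_ACCESS_TOKEN", version="20210713"):
    """Assemble the POST parameters for Foursquare's v2 venues/add endpoint."""
    return {
        "name": name,
        "ll": "{},{}".format(lat, lng),   # latitude,longitude as one string
        "primaryCategoryId": category_id,
        "oauth_token": oauth_token,
        "v": version,
    }

# example: a hypothetical new rock climbing spot (coordinates are illustrative)
params = build_add_venue_params(
    "Example Crag", 51.08, -115.35, "50328a4b91d4c4b30a586d6b")
# the actual call would be something like:
# requests.post(API_URL, data=params)
```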
The climbing data has been retrieved from Sendage.com using a PowerShell Invoke-WebRequest call to loop through all the pages of climbs:
for ( $i = 1; $i -lt 411; $i++ ) {
    Invoke-WebRequest -Uri "https://sendage.com/api/climbs" -OutFile "C:\filepath\file name$i.json" `
        -Method "POST" `
        -Headers @{
            "sec-ch-ua"="`" Not A;Brand`";v=`"99`", `"Chromium`";v=`"90`", `"Google Chrome`";v=`"90`""
            "Accept"="application/json, text/javascript, */*; q=0.01"
            "X-Requested-With"="XMLHttpRequest"
            "sec-ch-ua-mobile"="?0"
            "User-Agent"="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36"
            "Origin"="https://sendage.com"
            "Sec-Fetch-Site"="same-origin"
            "Sec-Fetch-Mode"="cors"
            "Sec-Fetch-Dest"="empty"
            "Referer"="https://sendage.com/search"
            "Accept-Encoding"="gzip, deflate, br"
            "Accept-Language"="en-US,en;q=0.9"
            "Cookie"="__utmz=45372454.1622165290.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __gads=ID=f756a2f56863dba7-22155fcebdc7007a:T=1622165290:RT=1622165290:S=ALNI_MYrTxsUPwHV-wQX4DfpI1gIQukTvA; CakeCookie[FeedType]=Q2FrZQ%3D%3D.; __utma=45372454.403148336.1622165290.1622165290.1622168460.2; __utmc=45372454; __utmt=1; __utmb=45372454.8.10.1622168460"
        } `
        -ContentType "application/x-www-form-urlencoded; charset=UTF-8" `
        -Body "mode=climb&page=$i&term=&areas%5B%5D=5794&area_parents=false&order%5B%5D=sends+DESC&rating=0&sends=0&limit=15&types%5Bb%5D%5Bon%5D=1&types%5Bs%5D%5Bon%5D=1&types%5Bt%5D%5Bon%5D=1"
}
The area information was gathered the same way, using the request below:
Invoke-WebRequest -Uri "https://sendage.com/areas/get_bounded?n=50.9074266406351&e=-114.72324695492293&s=50.85327246875764&w=-114.88460864925887&zoom=13" -OutFile "C:\filepath\ab_map1.json" `
    -Headers @{
        "sec-ch-ua"="`" Not;A Brand`";v=`"99`", `"Google Chrome`";v=`"91`", `"Chromium`";v=`"91`""
        "Accept"="application/json, text/javascript, */*; q=0.01"
        "X-Requested-With"="XMLHttpRequest"
        "sec-ch-ua-mobile"="?0"
        "User-Agent"="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
        "Sec-Fetch-Site"="same-origin"
        "Sec-Fetch-Mode"="cors"
        "Sec-Fetch-Dest"="empty"
        "Referer"="https://sendage.com/areas"
        "Accept-Encoding"="gzip, deflate, br"
        "Accept-Language"="en-US,en;q=0.9"
        "Cookie"="__gads=ID=f756a2f56863dba7-22155fcebdc7007a:T=1622165290:RT=1622165290:S=ALNI_MYrTxsUPwHV-wQX4DfpI1gIQukTvA; __utmz=45372454.1624109129.6.2.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); _ga=GA1.2.403148336.1622165290; __utmc=45372454; SendageSession=41aqan1n07lgg78j08fmegjhr5; wordpress_test_cookie=WP+Cookie+check; JCS_INENREF=https%3A//sendage.com/area/ab-canada; JCS_INENTIM=1626439801744; _wpss_h_=5; _wpss_p_=N%3A3%20%7C%20WzFdW0Nocm9tZSBQREYgUGx1Z2luXSBbMl1bQ2hyb21lIFBERiBWaWV3ZXJdIFszXVtOYXRpdmUgQ2xpZW50XSA%3D; PHPSESSID=dfvg7adirhr7ps40fr4crnr985; wordpress_logged_in_3811572f777979037a3eea8b7b1ddda7=ryanclarke%7C1626612629%7C2glYwOCoiueeSa4q3N9Nty4smAIrz0meNcYsvEXj5Wi%7Cfb79812e16e132753fe491cf3fbbd3caaba126527ccd091f3df4cc7d48203472; CakeCookie[FeedType]=Q2FrZQ%3D%3D.5gAkJ1gA%2BpI%3D; __utma=45372454.403148336.1622165290.1626439478.1626443256.28; __utmt=1; __utmb=45372454.1.10.1626443256"
    }
Although some area information could be downloaded, location data still had to be extracted from other websites; it is derived from Sendage.com, thecrag.com, and mountainproject.com.
Notes
I can then work off the main dataset called 'climbs'.
Here we start importing libraries required to process and analyze the data.
## import all the required libraries
import requests
import pandas as pd
import numpy as np
import random
import folium
import json
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors
from bs4 import BeautifulSoup
from geopy.geocoders import Nominatim
from IPython.display import Image
from IPython.core.display import HTML
from pandas import json_normalize # pandas.io.json.json_normalize is deprecated
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs # the samples_generator module was removed in newer scikit-learn
from folium import plugins, FeatureGroup, LayerControl, Map, Marker
from folium.plugins import MarkerCluster
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
%matplotlib inline
print('Libraries imported.')
The climbs data was normalized and consolidated using the code below. The loop iterates through each page, normalizes the data, and appends it to a master df, which is then converted to a CSV. Exporting to CSV allowed the data to be cleaned in Excel before the analysis; the cleaned data was then re-exported as a CSV file.
ab_dfmain = pd.DataFrame()
for i in range(1, 6163):
    with open('C:\\Users\\clark\\Documents\\Coursera Capstone Project\\Project Files\\Climbing Data\\Sendage Data\\Alberta Climbing Data\\alberta_climbs_pg%i.json' % i, 'r') as myfile:
        climbs = myfile.read()
    ab = json.loads(climbs)
    ab_df = json_normalize(ab['climbs'])
    ab_dfmain = ab_dfmain.append(ab_df, ignore_index=True)

# write the consolidated file once, after the loop completes
ab_dfmain.to_csv('Alberta_Climbs')
The cleaned climbs file is then converted back to a pandas dataframe.
The areas file also needs to be consolidated and cleaned. Below is the code to consolidate the files. The csv file is then cleaned in Excel.
ab_areas_main = pd.DataFrame()
for i in range(1, 19):
    with open('C:\\Users\\clark\\Documents\\Coursera Capstone Project\\Project Files\\Climbing Data\\Sendage Data\\Alberta Climbing Data\\ab_map%i.json' % i, 'r') as myfile:
        areas = myfile.read()
    ab_areas = json.loads(areas)
    ab_areas_df = json_normalize(ab_areas)
    ab_areas_main = ab_areas_main.append(ab_areas_df, ignore_index=True)

# write the consolidated file once, after the loop completes
ab_areas_main.to_csv('Alberta_Climbs_Areas.csv')
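On newer pandas versions, `DataFrame.append` has been removed; the same consolidation pattern used in both loops above can be sketched with a list of frames and `pd.concat`. The function below is a generic version of that pattern (the file paths passed in would be the page files downloaded earlier):

```python
import json
import pandas as pd

def consolidate_pages(paths, key='climbs'):
    """Normalize each downloaded JSON page and concatenate them into one DataFrame."""
    frames = []
    for path in paths:
        with open(path, 'r') as f:
            page = json.load(f)
        # json_normalize flattens the nested records under `key` into columns
        frames.append(pd.json_normalize(page[key]))
    return pd.concat(frames, ignore_index=True)
```

Collecting the frames in a list and concatenating once is also considerably faster than appending inside the loop, since each `append` copies the whole accumulated frame.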
The main .csv file is loaded with all the climb and location information.
climbs = pd.read_csv('C:\\Users\\clark\\Documents\\Coursera Capstone Project\\Project Files\\Alberta_Climbs_Cleaned_Main.csv', sep=',', header=0, engine='python')
Now it is time to gather the Foursquare data. I first set up the call for climbing gyms close to Calgary, Canmore, Edmonton, Lethbridge, and Banff.
I first set up my Foursquare credentials.
## foursquare credentials
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret
ACCESS_TOKEN = 'YOUR_ACCESS_TOKEN' # your Foursquare Access Token
VERSION = '20210713'
LIMIT = 100 # define limit
RADIUS = 1000 # define radius
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)
# set up df for Foursquare calls
cities = pd.DataFrame({
"city" : ["Calgary","Edmonton", "Lethbridge", "Canmore", "Banff"],
"lat" : [51.04523846324835, 53.55542738125147, 49.69386687166893, 51.08928664289112, 51.17690773491984],
"lng" : [-114.07168645706562, -113.49243690514189, -112.85376654838292,-115.34369824408034,-115.57244784758215]})
cities = cities.set_index('city') # assign the result so the index persists
Using the Foursquare API, I will gather all the gym information.
## foursquare calls
# gyms
category = "503289d391d4c4b30a586d6a" # CLIMBING GYM
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 10000 # define radius
## Calgary Call
url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&near={}&categoryId={}&radius={}&limit={}'.format(
CLIENT_ID,
CLIENT_SECRET,
VERSION,
"Calgary",
category,
radius,
LIMIT)
url # display URL
results = requests.get(url).json()
## Edmonton Call
url1 = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&near={}&categoryId={}&radius={}&limit={}'.format(
CLIENT_ID,
CLIENT_SECRET,
VERSION,
"Edmonton",
category,
radius,
LIMIT)
url1 # display URL
results1 = requests.get(url1).json()
Looks like the new bouldering-specific climbing gym, as well as the popular Elevation Place, are not listed in Canmore. I will add them manually below.
venues = json_normalize(results['response']['venues'])
venues1 = json_normalize(results1['response']['venues'])
venues = venues.append(venues1, sort=False)
venues = venues.drop(['id','categories', 'referralId', 'hasPerk', 'location.labeledLatLngs', 'location.cc', 'location.state'], axis=1)
venues
The table above shows all the climbing gyms in Alberta but includes information that is not required. I will remove the unneeded columns by creating a new df.
ab_venues = venues[['name', 'location.lat', 'location.lng']]
ab_venues
Some climbing gym information did not show up in the Foursquare data, so they are added below.
ab_venues = ab_venues.append({"name":"Elevation Place", "location.lat":51.088826750378715,"location.lng":-115.35085760398437},
ignore_index = True)
ab_venues = ab_venues.append({"name":"Canmore Climbing Gym", "location.lat":51.0944466842228,"location.lng":-115.35813175027772},
ignore_index = True)
ab_venues = ab_venues.append({"name":"Vertical Addiction", "location.lat":51.093543768388244,"location.lng":-115.35814248361741},
ignore_index = True)
ab_venues = ab_venues.append({"name":"Coulee Climbing", "location.lat":49.69607730861128,"location.lng":-112.81831293543736},
ignore_index = True)
ab_venues = ab_venues.append({"name":"Ascent Climbing Centre", "location.lat":49.67710003598693,"location.lng":-112.86533079367096},
ignore_index = True)
ab_venues
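The repeated `append` calls above can be collapsed into a single concatenation, which is also the required form on newer pandas where `append` has been removed. The coordinates below are the same ones used above; the empty frame at the top is a placeholder standing in for the `ab_venues` frame built earlier:

```python
import pandas as pd

# placeholder for the ab_venues frame built from the Foursquare results above
ab_venues = pd.DataFrame(columns=["name", "location.lat", "location.lng"])

# the gyms missing from the Foursquare results, with the coordinates used above
missing_gyms = pd.DataFrame([
    {"name": "Elevation Place", "location.lat": 51.088826750378715, "location.lng": -115.35085760398437},
    {"name": "Canmore Climbing Gym", "location.lat": 51.0944466842228, "location.lng": -115.35813175027772},
    {"name": "Vertical Addiction", "location.lat": 51.093543768388244, "location.lng": -115.35814248361741},
    {"name": "Coulee Climbing", "location.lat": 49.69607730861128, "location.lng": -112.81831293543736},
    {"name": "Ascent Climbing Centre", "location.lat": 49.67710003598693, "location.lng": -112.86533079367096},
])

ab_venues = pd.concat([ab_venues, missing_gyms], ignore_index=True)
```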
Now I can start gathering the retail store data, specifically for stores that sell climbing equipment. This data will be extracted from Foursquare using the sporting goods shop and outdoor retailer category IDs.
# retail stores
category = "4bf58dd8d48988d1f2941735" # Sporting Goods Stores
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 10000 # define radius
## Calgary Call
url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&near={}&categoryId={}&radius={}&limit={}'.format(
CLIENT_ID,
CLIENT_SECRET,
VERSION,
"Calgary",
category,
radius,
LIMIT)
url # display URL
store_results = requests.get(url).json()
## Edmonton Call
url1 = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&near={}&categoryId={}&radius={}&limit={}'.format(
CLIENT_ID,
CLIENT_SECRET,
VERSION,
"Edmonton",
category,
radius,
LIMIT)
url1 # display URL
store_results1 = requests.get(url1).json()
## Canmore Call
url2 = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&near={}&categoryId={}&radius={}&limit={}'.format(
CLIENT_ID,
CLIENT_SECRET,
VERSION,
"Canmore",
category,
radius,
LIMIT)
url2 # display URL
store_results2 = requests.get(url2).json()
## Banff Call
url3 = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&near={}&categoryId={}&radius={}&limit={}'.format(
CLIENT_ID,
CLIENT_SECRET,
VERSION,
"Banff",
category,
radius,
LIMIT)
url3 # display URL
store_results3 = requests.get(url3).json()
## Lethbridge Call
url4 = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&near={}&categoryId={}&radius={}&limit={}'.format(
CLIENT_ID,
CLIENT_SECRET,
VERSION,
"Lethbridge",
category,
radius,
LIMIT)
url4 # display URL
store_results4 = requests.get(url4).json()
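The five near-identical calls above can be collapsed into one helper that builds the search URL per city. The endpoint and parameters mirror the calls above; the credential values are placeholders, and the network request itself is left commented out:

```python
def venue_search_url(city, category, client_id="YOUR_CLIENT_ID",
                     client_secret="YOUR_CLIENT_SECRET",
                     version="20210713", radius=10000, limit=100):
    """Build the Foursquare v2 venues/search URL used in the calls above."""
    return ('https://api.foursquare.com/v2/venues/search'
            '?&client_id={}&client_secret={}&v={}&near={}'
            '&categoryId={}&radius={}&limit={}').format(
        client_id, client_secret, version, city, category, radius, limit)

cities = ["Calgary", "Edmonton", "Canmore", "Banff", "Lethbridge"]
urls = {city: venue_search_url(city, "4bf58dd8d48988d1f2941735") for city in cities}

# in the notebook, each result would then come from:
# store_results = {city: requests.get(url).json() for city, url in urls.items()}
```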
The data is then normalized.
stores = json_normalize(store_results['response']['venues'])
stores1 = json_normalize(store_results1['response']['venues'])
stores2 = json_normalize(store_results2['response']['venues'])
stores3 = json_normalize(store_results3['response']['venues'])
stores4 = json_normalize(store_results4['response']['venues'])
stores = stores.append(stores1, sort=False)
stores = stores.append(stores2, sort=False)
stores = stores.append(stores3, sort=False)
stores = stores.append(stores4, sort=False)
ab_stores = stores[['name', 'location.lat', 'location.lng']]
ab_stores.head()
Unfortunately, this dataset includes more stores that do not sell climbing equipment than ones that do, so I will select only the stores that sell climbing equipment.
climbing_stores = ab_stores[ab_stores['name'] == "atmosphere"]
climbing_stores = climbing_stores.append(ab_stores[ab_stores['name'] == "Atmosphere Edmonton Eaton Centre"])
climbing_stores = climbing_stores.append(ab_stores[ab_stores['name'] == "Vertical Addiction"])
climbing_stores = climbing_stores.append(ab_stores[ab_stores['name'] == "Gearup Mountain Sport & Rentals"])
climbing_stores = climbing_stores.append(ab_stores[ab_stores['name'] == "Manod"])
climbing_stores = climbing_stores.append(ab_stores[ab_stores['name'] == "Atmosphere Banff"])
climbing_stores = climbing_stores.append(ab_stores[ab_stores['name'] == "MEC Calgary"])
climbing_stores = climbing_stores.append(ab_stores[ab_stores['name'] == "Awesome Adventures"])
climbing_stores
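The chain of appends above can also be written as a single `isin` filter over the known climbing retailers. The store names below are the ones selected above; the small `ab_stores` frame here is an illustrative stand-in (with made-up coordinates) for the frame built from the Foursquare results:

```python
import pandas as pd

# names of the stores known to sell climbing equipment, as selected above
climbing_retailers = [
    "atmosphere", "Atmosphere Edmonton Eaton Centre", "Vertical Addiction",
    "Gearup Mountain Sport & Rentals", "Manod", "Atmosphere Banff",
    "MEC Calgary", "Awesome Adventures",
]

# illustrative stand-in for the ab_stores frame built above
ab_stores = pd.DataFrame({
    "name": ["MEC Calgary", "Sport Chek", "Manod"],
    "location.lat": [51.04, 51.05, 51.18],
    "location.lng": [-114.07, -114.08, -115.57],
})

# one vectorized filter replaces the eight append calls
climbing_stores = ab_stores[ab_stores["name"].isin(climbing_retailers)]
```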
I plotted all the climbs on a map to test out the data, and it looks good. I also used Folium's marker clustering to speed up rendering of the map.
# basic map with all the climbs
alberta_climbs_all = folium.Map(
    location=[50.9199388585142, -113.98640435217936],
    zoom_start=7,
    tiles="Stamen Terrain",
    control_scale=True)
areas_marker_cluster = MarkerCluster().add_to(alberta_climbs_all)

# add markers to map
for lat, lng, climb, climb_type, grade in zip(climbs['lat'], climbs['lon'], climbs['climb'], climbs['type'], climbs['grade_norm']):
    label = 'Name of Climb: {}, Climb Type: {}, Climb Grade: {}'.format(climb, climb_type, grade)
    label = folium.Popup(label, parse_html=True)
    folium.Marker(
        [lat, lng],
        popup=label,
        icon=folium.Icon(color='blue')).add_to(areas_marker_cluster)

alberta_climbs_all